Generating Automatic Keywords for Conversational Speech ASR Transcripts

Authors

  • Hohyon Ryu
  • Matthew Lease
Abstract

While a plethora of conversational speech has been recorded and archived for over a century, it has not been easily accessible: compared with text and rehearsed speech, it poses many technical challenges that must be addressed before conversational archives can be effectively searched and used. In this paper, we describe two language modeling methods for automatically assigning keywords to automatic speech recognition (ASR) transcripts, to benefit search and browsing of conversational speech archives. Experiments are performed with the English CLEF CL-SR MALACH collection of oral history interviews. In comparison to a prior baseline generating 20 keywords per conversation segment, we use 1/20th the training data yet improve Recall@20 in matching manual keywords. However, while indexing manual keywords yields improved search accuracy, indexing automatic keywords (ours or the baseline) fails to improve search accuracy, evidencing the need for additional research.
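The abstract does not spell out how Recall@20 is computed. Under the usual reading, it is the fraction of a segment's manual keywords that appear among the top 20 automatically generated keywords. The sketch below illustrates that reading only; the function name `recall_at_k`, the handling of segments with no manual keywords, and the sample keywords are assumptions for illustration, not details taken from the paper.

```python
def recall_at_k(predicted, reference, k=20):
    """Fraction of reference (manual) keywords found in the top-k predictions.

    predicted: list of keywords ordered from most to least confident.
    reference: iterable of manually assigned keywords for the same segment.
    """
    reference_set = set(reference)
    if not reference_set:
        # Assumption: a segment with no manual keywords contributes 0.0.
        return 0.0
    top_k = set(predicted[:k])
    return len(top_k & reference_set) / len(reference_set)


# Hypothetical segment: two of the three manual keywords are recovered.
automatic = ["deportation", "family life", "ghettos", "schools"]
manual = ["ghettos", "deportation", "liberation"]
score = recall_at_k(automatic, manual, k=20)
```

Averaging this score over all segments would give a collection-level figure; whether the paper averages per segment or pools keywords across segments is not stated in the abstract.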


Similar resources

Acoustic Model Training with Detecting Transcription Errors in the Training Data

As the target of Automatic Speech Recognition (ASR) has moved from clean read speech to spontaneous conversational speech, we need to prepare orthographic transcripts of spontaneous conversational speech to train acoustic models (AMs). However, it is expensive and slow to manually transcribe such speech word by word. We propose a framework to train an AM based on easy-to-make rough transcripts ...


A lightweight keyword and tag-cloud retrieval algorithm for automatic speech recognition transcripts

The Fraunhofer IAIS AudioMining system for vocabulary independent spoken term detection is able to provide automatic speech recognition (ASR) transcripts for audio-visual data. These transcripts can be used to search for information, e.g., in audio-visual archives. We experienced difficulties in the process of browsing for desired content when only these transcripts are given, especially since ...


Automatic Recognition of Emotionally Coloured Speech

Emotion in speech is an issue that has been attracting the interest of the speech community for many years, both in the context of speech synthesis as well as in automatic speech recognition (ASR). In spite of the remarkable recent progress in Large Vocabulary Recognition (LVR), it is still far behind the ultimate goal of recognising free conversational speech uttered by any speaker in any envi...


The fifth 'CHiME' Speech Separation and Recognition Challenge: Dataset, task and baselines

The CHiME challenge series aims to advance robust automatic speech recognition (ASR) technology by promoting research at the interface of speech and language processing, signal processing, and machine learning. This paper introduces the 5th CHiME Challenge, which considers the task of distant multimicrophone conversational ASR in real home environments. Speech material was elicited using a dinn...


An empirical analysis of word error rate and keyword error rate

This paper studies the relationship between word error rate (WER) and keyword error rate (KER) in speech transcripts and their effect on the performance of speech analytics applications. Automatic speech recognition (ASR) systems are increasingly used as input for speech analytics, which raises the question of whether WER or KER is the more suitable performance metric for calibrating the ASR sy...




Publication date: 2013